
Sequence length 2175 

GAC*X3CnX:€GCAAATTTCCTGATTCT^^ 

ATAGCXXTITATCACTQCCATCACCA^^ 

TTTCAAAGTCAGCAGAAAGGAAGCTC^^ 

M G W L F L K V L L A G V S F S G 17
CTTCACCACC ATG GGC TGG CTT TTT CTA AAG GTT TTG TTG GCX5 GGA GTG AGT TTC TCA GGA 51

FLYPLVDFCISGKTRGQKPN 37 
TTT CTT TAT CCT CTT GTG GAT TTT TGC ATC AGT GGG AAA ACA AGA GGA CAG AAG CCA AAC 111

FVIILADDMGWGDLGANWAE 57 
TTT GTG ATT ATT TTG GCC GAT GAC ATG GGG TGG GGT GAC CTG GGA GCA AAC TGG GCA GAA 171

TKDTANLDKMASEGMRFVDF 77 
ACA AAG GAC ACT GCC AAC CTT GAT AAG ATG GCT TCG GAG GGA ATG AGG TTT GTG GAT TTC 231

HAAASTCSPSRASLLTGRLG 97
CAT GCA GCT GCC TCC ACC TGC TCA CCC TCC CGG GCT TCC TTG CTC ACC GGC CGG CTT GGC 291

LRNGVTRNFAVTSVGGL PLN 117 
CTT CGC AAT GGA GTC ACA CGC AAC TTT GCA GTC ACT TCT GTG GGA GGC CTT CCG CTC AAC 351 



ETTLAEVLQQAGYVTGIIGK 137
GAG ACC ACC TTG GCA GAG GTG CTG CAG CAG GCG GGT TAC GTC ACT GGG ATA ATA GGC AAA 411

WHLGHHGSYHPNFRGFDYYF 157
TGG CAT CTT GGA CAC CAC GGC TCT TAT CAC CCC AAC TTC CGT GGT TTT GAT TAC TAC TTT 471



GIPYSHDMGCTDTPGYNHPP 177

GGA ATC CCA TAT AGC CAT GAT ATG GGC TGT ACT GAT ACT CCA GGC TAC AAC CAC CCT CCT 531 

CPACPQGDGPSRNLQRDCYT 197

TGT CCA GCG TGT CCA CAG GGT GAT GGA CCA TCA AGG AAC CTT CAA AGA GAC TGT TAC ACT 591

DVALPLYENLNIVEQPVNLS 217

GAC GTG GCC CTC CCT CTT TAT GAA AAC CTC AAC ATT GTG GAG CAG CCG GTG AAC TTG AGC 651 

SLAQKYAEKATQFIQRASTS 237 

AGC CTT GCC CAG AAG TAT GCT GAG AAA GCA ACC CAG TTC ATC CAG CGT GCA AGC ACC AGC 711 

GRPFLLYVALAHMHVPLPVT 257

GGG AGG CCC TTC CTG CTC TAT GTG GCT CTG GCC CAC ATG CAC GTG CCC TTA CCC GTG ACT 771

QLPAAPRGRSLYGAGLWEMD 277 

CAG CTA CCA GCA GCG CCA CGG GGC AGA AGC CTG TAT GGT GCA GGG CTC TGG GAG ATG GAC 831

SLVGQIKDKVDHTVKENTFL 297

AGT CTG GTG GGC CAG ATC AAG GAC AAA GTT GAC CAC ACA GTG AAG GAA AAC ACA TTC CTC 891 

WFTGDNGPWAQKCELAGSVG 317 

TGG TTT ACA GGA GAC AAT GGC CCG TGG GCT CAG AAG TGT GAG CTA GCG GGC AGT GTG GGT 951

PFTGFWQTRQGGSPAKQTTW 337

CCC TTC ACT GGA TTT TGG CAA ACT CGT CAA GGG GGA AGT CCA GCC AAG CAG ACG ACC TGG 1011

EGGHRVPALAYWPGRVPVNV 357 



GAA GGA GGG CAC CGG GTC CCA GCA CTG GCT TAC TGG CCT GGC AGA GTT CCA GTT AAT GTC 1071

TSTALLSVLDIFPTVVALAQ 377

ACC AGC ACT GCC TTG TTA AGC GTG CTG GAC ATT TTT CCA ACT GTG GTA GCC CTG GCC CAG 1131

ASLPQGRRFDGVDVSEVLFG 397 

GCC AGC TTA CCT CAA GGA CGG CGC TTT GAT GGT GTG GAC GTC TCC GAG GTG CTC TTT GGC 1191

RSQPGHRVLFHPNSGAAGEF 417 

CGG TCA CAG CCT GGG CAC AGG GTG CTG TTC CAC CCC AAC AGC GGG GCA GCT GGA GAG TTT 1251

GALQTVRLERYKAFYITGGA 437 

GGA GCC CTG CAG ACT GTC CGC CTG GAG CGT TAC AAG GCC TTC TAC ATT ACC GGT GGA GCC 1311 

RACDGSTGPELQHKFPLIFN 457 

AGG GCG TGT GAT GGG AGC ACG GGG CCT GAG CTG CAG CAT AAG TTT CCT CTG ATT TTC AAC 1371 

LEDDTAEAVPLERGGAEYQA 477

CTG GAA GAC GAT ACC GCA GAA GCT GTG CCC CTA GAA AGA GGT GGT GCG GAG TAC CAG GCT 1431



VLPEVRKVLADVLQDIANDN 497 
GTG CTG CCC GAG GTC AGA AAG GTT CTT GCA GAC GTC CTC CAA GAC ATT GCC AAC GAC AAC 1491 



ISSADYTQDPSVTPCCNPYQ 517

ATC TCC AGC GCA GAT TAC ACT CAG GAC CCT TCA GTA ACT CCC TGC TGT AAT CCC TAC CAA 1551



IACRCQAA* 526
ATT GCC TGC CGC TGT CAA GCC GCA TAA 1578 
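
The amino-acid rows in this listing are the conceptual translation of the codon rows printed beneath them. A minimal sketch of that translation follows, assuming the standard genetic code; the function name `translate` and the use of `X` for codons that cannot be read are illustrative choices, not something the listing itself specifies.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G base order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string.

    '*' marks a stop codon; 'X' marks a codon that is not a valid
    triplet of A/C/G/T (for example, one damaged during scanning).
    A trailing partial codon is ignored.
    """
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3
    codons = (seq[i:i + 3] for i in range(0, usable, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


# Final codon row of the 2175-nucleotide sequence (residues 518-526).
print(translate("ATT GCC TGC CGC TGT CAA GCC GCA TAA"))  # IACRCQAA*
```

Because each row holds 20 codons, the nucleotide count at the end of a codon row is three times the residue count at the end of the matching amino-acid row (for example 526 residues and 1578 nucleotides).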



CAGACCAATTTTTATTCCACGAGGAGGAGTACCTGGAAATTAGGCAAGTTTGC^^ 
ACAAACACACGCTTTAGTTTAGTCTTGGAGTTTAGTTTTGGAGTTAGCCT^^



TCCACGCCGACCa3AGAGCAGCriGAGCTGCGCIX3GCTCTGGGC^
GAGTCAGGCACAGGTGCCAGCTCCAGCTTTTGAACITX3GGO 
AAATAAAGGCATACATGAAAAAAAAAAAAAAAAA 
Prosite Pattern Matches

Prosite version: Release 12.2 of February 1995

>PS00001|PDOC00001|ASN_GLYCOSYLATION N-glycosylation site.

Query: 117 NETT 120 

Query: 215 NLSS 218 

Query: 356 NVTS 359

Query: 497 NISS 500
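
Each "Query" line in these reports gives the start position, the matched residues, and the end position of one pattern hit. The sketch below shows how such hits could be reproduced with regular expressions. It assumes the published PROSITE definitions N-{P}-[ST]-{P} (PS00001), [ST]-x-[RK] (PS00005) and [ST]-x(2)-[DE] (PS00006); the names `PATTERNS` and `scan` are hypothetical, and this is not the program that produced the report.

```python
import re

# PROSITE patterns rewritten as Python regular expressions (assumed encodings):
#   PS00001  N-{P}-[ST]-{P}
#   PS00005  [ST]-x-[RK]
#   PS00006  [ST]-x(2)-[DE]
PATTERNS = {
    "ASN_GLYCOSYLATION": r"N[^P][ST][^P]",
    "PKC_PHOSPHO_SITE": r"[ST].[RK]",
    "CK2_PHOSPHO_SITE": r"[ST].{2}[DE]",
}


def scan(protein, name, offset=1):
    """Return (start, motif, end) for every hit, overlapping hits included.

    `offset` is the residue number of the first character of `protein`.
    """
    # Wrapping the pattern in a lookahead lets overlapping sites be reported.
    regex = re.compile(f"(?=({PATTERNS[name]}))")
    hits = []
    for match in regex.finditer(protein):
        motif = match.group(1)
        start = match.start() + offset
        hits.append((start, motif, start + len(motif) - 1))
    return hits


# Residues 98-137 of the first translated sequence.
fragment = "LRNGVTRNFAVTSVGGLPLNETTLAEVLQQAGYVTGIIGK"
print(scan(fragment, "ASN_GLYCOSYLATION", offset=98))  # [(117, 'NETT', 120)]
print(scan(fragment, "CK2_PHOSPHO_SITE", offset=98))   # [(120, 'TLAE', 123)]
```

Short patterns like these match by chance in most proteins, so a hit marks a candidate site rather than a confirmed modification.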

>PS00005|PDOC00005|PKC_PHOSPHO_SITE Protein kinase C phosphorylation site.

Query: 28 SGK 30
Query: 93 TGR 95
Query: 237 SGR 239
Query: 290 TVK 292
Query: 422 TVR 424


>PS00006|PDOC00006|CK2_PHOSPHO_SITE Casein kinase II phosphorylation site.

Query: 120 TLAE 123
Query: 290 TVKE 293
Query: 335 TTWE 338
Query: 364 SVLD 367
Query: 444 TGPE 447
Query: 499 SSAD 502


>PS00008|PDOC00008|MYRISTYL N-myristoylation site.

Query: 12 GVSFSG 17
Query: 33 GQKPNF 38
Query: 52 GANWAE 57
Query: 97 GLRNGV 102
Query: 113 GLPLNE 118
Query: 158 GIPYSH 163
Query: 328 GGSPAK 333
Query: 388 GVDVSE 393
Query: 418 GALQTV 423
Query: 435 GGARAC 440


>PS00009|PDOC00009|AMIDATION Amidation site.

Query: 382 QGRR 385



>PS00149|PDOC00117|SULFATASE_2 Sulfatases signature 2.
Query: 129 GYVTGIIGKW 138 
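
Because every match line carries both a start and an end position, the two numbers can be checked against the length of the matched motif, which is a cheap way to catch scanning errors in the coordinates. The following is a minimal sketch under that assumption; `parse_match` and `is_consistent` are hypothetical helper names.

```python
import re

# One match line: "Query:", start position, matched residues, end position.
QUERY_RE = re.compile(r"^Query:\s+(\d+)\s+([A-Z]+)\s+(\d+)\s*$")


def parse_match(line):
    """Return (start, motif, end), or None if the line is not well formed."""
    m = QUERY_RE.match(line.strip())
    if m is None:
        return None
    return int(m.group(1)), m.group(2), int(m.group(3))


def is_consistent(start, motif, end):
    """A hit is self-consistent when its span equals the motif length."""
    return end - start + 1 == len(motif)


hit = parse_match("Query: 129 GYVTGIIGKW 138")
print(hit, is_consistent(*hit))
```

A line whose start position contains a non-digit character fails to parse, and a line whose span disagrees with the motif length is flagged, so both kinds of damage can be listed for manual review against the translated sequence.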






Input: file Fbh23553f l.seq; Output File 23553. trans 

Sequence length 4321
OOCAOGOnXXXXSCTAATGWlT^^ 

GAGGAACATGACTCTXSCCCXrrrOOGAC^^ 
TTCCCAGAGCTTTTTCTOT 
AAACTTQQCAAATCACATQCACSGTTCTT^ 
CTGAATACCKrrGAGAATAGAGATTXSA 

MKYSCCALVLA 11 

CATTTTGTCAGTTTTGCAACAT1XX5^ ATG AAG TAT TCT TQC TGT GCT CTC GTT TTG GCT 33 

VLGTELLGSLCSTVRSPRFR 31
GTC CTG GQC ACA GAA TTG CTG GGA AC3C CTC TGT TCG ACT GTC AGA TCC CCG AGG TTC AGA 93

GRIQQERKNIRPNIILVLTD 51

GGA OGG ATA GAG CAG GAA CGA AAA AAC ATC OCA OCC AAC ATT ATT CTT GTG CTT ACC GAT 153

DQDVELGSLQVMNKTRKIME 71

GAT CAA GAT GTG GAG CTG GGG TCC CTG CAA GTC ATG AAC AAA ACG AGA AAG ATT ATC GAA 213



HGGATFINAFVTTPMCCPSR 91

CAT GGG GGG GCC ACC TTC ATC AAT GCC TTT GTC ACT ACA CCC ATG TCC TGC CCG TCA CGG 273

SSMLTGKYVHNHNVYTNNEN 111

TCC TCC ATC CTC ACC GGG AAG TAT GTC CAC AAT CAC AAT GTC TAC ACC AAC AAC GAG AAC 333 



CSSPSWQAMHEPRTFAVYLN 131

TGC TCT TCC CCC TOG TGG CAG GCC ATC CAT GAG CCT OGG ACT TTT GCT GTA TAT CTT AAC 393 

N T G Y R T A F F G K Y L N E Y N G S Y 151 

AAC ACT GGC TAC AGA ACA GCC TTT TTT GGA AAA TAC CTC AAT GAA TAT AAT GGO AGO TAC 453 

IPPGWREWLGLIKNSRFYNY 171



ATC CCC CCT GGG TGG OGA GAA TGG CTT GGA TTA ATC AAG AAT TCT CGC TTC TAT AAT TAC 513 

TVCRNGIKEKHGFDYAKDYF 191

ACT GTT TGT OGC AAT GGC ATC AAA GAA AAG CAT GGA TTT GAT TAT GCA AAG GAC TAC TTC 573 

TDLITNESINYFKMSKRMYP 211



ACA GAC TTA ATC ACT AAC GAG AGC ATT AAT TAC TTC AAA ATC TOT AAG AGA ATC TAT CCC 633 

HRPVMMVISHAAPHGPEDSA 231

CAT AGG OOC GTT ATC ATC GTC ATC AGC CAC GCT GOG COO CAC GGC COC GAG GAC TCA GCC 693 

PQFSKLYPNASQHITPSYNY 251 

CCA CAG TTT TCT AAA CTC TAC COC AAT GCT TOO CAA CAC ATA ACT OCT AGT TAT AAC TAT 753 

APNMDKHWIMQYTGPMLPIH 271 

GCA OCA AAT ATC GAT AAA CAC TOG ATT ATC CAG TAC ACA GGA CCA ATC CTC COC ATC CAC 813 

MEFTNILQRKRLQTLMSVDD 291

ATC GAA TTT ACA AAC ATT OTA CAG CGC AAA AGG CTC CAG ACT TTC ATC TCA GTC GAT GAT 873 

SVERLYNMLVETGELENTYI 311



FIG'S C-lO 



TCT GTG GAG AGG CTG TAT AAC ATC CTC GTG GAG ACG GGG GAG CTC GAG AAT' ACT TAC ATC 933 



H G y H 



— ^ - - ^ - - " V K G K S 331 

ATT TAC AOC GCC GAC CT^T GGT TAC CAT ATT GGG CyVG TTT GGA CTC GTC AAG G^ 993 



M 



D I 



F 



R G 



331 



- --^wi^vj:::*^ 351 

ATG CCA TAT GAC TTT GAT ATT COT GTG CCT TTT TTT ATT CGT GCM' CCA AGT Cn-A GAA ^ 1053 



^^^VPQIVLNIDLAPTILDI 371 
GGA TCA ATA GTC CCA CAG ATC GTT CTC AAC ATT GAC TTG GCC CCC ACG ATC CTC GAT ATT 1113 



AGLDTPPDVDGKSVLKLLDP 391
GCT GGG CTC GAC ACA CCT CCT GAT GTC GAC GGC AAG TCT GTC CTC AAA CTT CTC GAC CCA 1173

EKPGNRFRTNKKAKIWRDTF 411
GAA AAG CCA GGT AAC AGG TTT CGA ACA AAC AAG AAG GCC AAA ATT TX3G CGT GAT ACA TTC 1233

LVERGKFLRKKEESSKNIQQ 431
CTA GTG GAA AGA GGC AAA TTT CTA CGT AAG AAG GAA GAA TCC AGC AAG AAT ATC CAA CAG 1293

SNHLPKYERVKELCQQARYQ 451
TCA AAT CAC TTG CCC AAA TAT GAA CGG GTC AAA GAA CTA TCC CAG CAG GCC AGG TAC CAG 1353

TACEQPGQKWQCIEDTSGKL 471
ACA GCC TGT GAA CAA CCG GGG CAG AAG TGG CAA TGC ATT GAG GAT ACA TCT GGC AAG CTT 1413

RIHKCKGPSDLLTVRQSTRN 491
CGA ATT CAC AAG TGT AAA GGA CCC AGT GAC CTC CTC ACA GTC CGG CAG AGC ACG CGG AAC 1473

LYARGFHDKDKECSCRESGY 511
CTC TAC GCT OC3C GGC TTC CAT GAC AAA GAC AAA GAG TCC AGT TCT AGG GAG TCT GGT TAC 1533

RASRSQRKSQRQFLRNQGTP 531
CGT GCC AGC AGA AGC CAA AGA AAG AGT CAA CGG CAA TTC TTC AGA AAC CAG GGG ACT CCA 1593

KYKPRFVHTRQTRSLSVEFE 551
AAG TAC AAG CCC AGA TTT GTC CAT ACT CGG CAG ACA CGT TCC TTC TCC GTC GAA TTT GAA 1653



GEIYDINLEEEEELQVLQPR 571
GGT GAA ATA TAT GAC ATA AAT CTC GAA GAA GAA GAA GAA TTC CAA GTC TTC CAA CCA AGA 1713

NIAKRHDEGHKGPRDLQASS 591
AAC ATT GCT AAG CGT CAT GAT GAA GGC CAC AAG GGG CCA AGA GAT CTC CAG GCT TCC AGT 1773

GGNRGRMLADSSNAVGPPTT 611
GGT GGC AAC AGG GGC AGG ATC CTC GCA GAT AGC AGC AAC GCC GTC GGC CCA CCT ACC ACT 1833

VRVTHKCFILPNDSIHCERE 631
GTC OGA GTC ACA CAC AAG TGT TTT ATT CTT CCC AAT GAC TCT ATC CAT TCT GAG AGA GAA 1893

LYQSARAWKDHKAYIDKEIE 651
CTC TAC CAA TCG GCC AGA GCG TGG AAG GAC CAT AAG GCA TAC ATT GAC AAA GAG ATT GAA 1953

ALQDKIKNLREVRGHLKRRK 671
GCT CTC CAA GAT AAA ATT AAG AAT TTA AGA GAA GTC AGA GGA CAT CTC AAG AGA AGG AAG 2013



PEECSCSKQSYYNKEKGVKK 691 
CCT GAG GAA TGT AGC TGC AGT AAA CAA AGC TAT TAC AAT AAA GAG AAA GGT GTA AAA AAG 2073 



QEKLKSHLHPFKEAAQEVDS 711 
CAA GAG AAA TTA AAG AGC CAT CTT CAC CCA TTC AAG GAG GCT GCT CAG GAA GTA GAT AGC 2133 



K L Q L F K E N N R R R R K E R k""" E 



RQRKOEECSI. P G L T C F T H n „ 
OGGCAGAGGAAGGQGGAAGAGTGCAOCCTG CCTOGCCnCitfrrTOCTICAa^ 2^ 

^ <^ ^ A P F W NI.GSFCACn.co 

AACCACTGGCAGACAGa:O0GTTCTGGAACCaGGGATCT T1X:tX3TGCTTGCA^A^ ^ 2^3 



_N ^ NTYWCLRTVNETH 



AAC AAT AAC AOC TAG TGG TGT arc OGT ACA OCT j Ut GAG ACG CAT AAT C^ jI^J 

G«i TTT GCT ACT 1^ VIT TiU GAG TAT TTT GAT ATC AAT ACA GAT CCT TAT cL C^ aL 2«J 

NTVHTVERGILNQLHVOLMP 
AAT ACA GTG CAC ACG GTA GAA CGA GGC ATT TiU AAT CAG CTA CAC GTA CAA CTA ATO G^ G 2«J 

£__£ Q GYKQCNPR P K N L D V G 

CTC AGA AGC TCT CAA GUA TAT AAG CAG TGC AAC CCA AGA CCT AAG AAT CTT GAT GTT gL 2^5^ 

^ ° G S Y DLHRGQLWDGWEG fi7l 

AAT AAA GAT GGA GGA AGC TAT GAC CTA CAC AGA GGA CAG TTA TOG GAT GGA TCG gL 2!!^ 

TAA 872 

2616 

TCAGCOCaSTCTCACTGCAGACATCAACTGGCAAGGOCTAGAGGAGCTACACAGTCTCAATCAAAAC^ 

AGACAAAACTACAGACTTAGTCTQGTGGACTGGACTAATTACTTGAAGGATITAGATAGAGTATnxxrAC 

GTCACTATGAGCAAAATAAAACAAATAAGACTCAAACTGCTCAAAGTCACGGGlTXnTC^^ 

CCAGCTGACCTTCAAAOOCTGCATTTGAACCGAOCAACATTAAGTOaMSAGAGTAAACTTCA^ 

CAGAAGTrAATCATTTGAATTCTGAACACTQGAGAAAAAOOGAAAAATOGACGGGGCATCAAGAGACT^ 

AACOGATrTCAGTGGOGATGGCATGACAGAGCTAGAGCTOGGGOCCAGCCCCAGGCTOCAGCCCATTCGCA^ 

AAAGAACTTOOOCAGTATGGTGGKXnX3GAAAQGACATTTTTGAAGATO«CTAT 

TrTCAGTlxaiTCAGATGTTCACCATQGCCACCGCAGAACAOOGAAGrAATl«»Q^^^ 

AAACTCOTAOCTTAOCXrrAAACACAGTATTTCTTTTTAACTXTTTXATTTCTAAACTAATAA^ 

AACATTGCAAGCTAOOCTGGCTAOCTTTCTGCAGTAGAAGCTAGTGflOCATCTOAGCAAGOGC^^ 

CATOGTTATAATTTACTATCTGOCAAOGAGTAGAAAGAAAGQCTQGQGATATTTOGGTIGGCTriGGKTTTCAT^ 

CSCTTOGTTGGTTGGTrTGKACTAAAACAGTATTATCTTITOAATATOGTAGGGACATAARK^^ 

™RAI«I«SYWRRAWKGOGSTYTYTSKKRKSTI^^ 

KTAATGAAGTT 



Analysis of 23553 (871 aa)

[Figure: secondary-structure prediction plots (Chou-Fasman, Garnier-Robson) and Hydrophilicity Plot - Kyte-Doolittle]
Prosite Pattern Matches for 23553

Prosite version: Release 12.2 of February 1995

>PS00001|PDOC00001|ASN_GLYCOSYLATION N-glycosylation site.



Query: 64 NKTR 67
Query: 111 NCSS 114
Query: 131 NNTG 134
Query: 148 NGSY 151
Query: 170 NYTV 173
Query: 197 NESI 200
Query: 240 NASQ 243
Query: 623 NDSI 626
Query: 773 NNTY 776
Query: 783 NETH 786



>PS00005|PDOC00005|PKC_PHOSPHO_SITE Protein kinase C phosphorylation site.

Query: 24 TVR 26
Query: 27 SPR 29
Query: 66 TRK 68
Query: 96 TGK 98
Query: 206 SKR 208
Query: 400 TNK 402
Query: 425 SSK 427
Query: 468 SGK 470
Query: 484 TVR 486
Query: 488 STR 490
Query: 505 SCR 507
Query: 516 SQR 518
Query: 520 SQR 522
Query: 530 TPK 532
Query: 611 TVR 613
Query: 615 THK 617
Query: 635 SAR 637


>PS00006|PDOC00006|CK2_PHOSPHO_SITE Casein kinase II phosphorylation site.

Query: 107 TNNE 110
Query: 288 SVDD 291
Query: 367 TILD 370
Query: 376 TPPD 379
Query: 452 TACE 455
Query: 505 SCRE 508
Query: 781 TVNE 784



FiG- ^ ay 




>PS00007|PDOC00007|TYR_PHOSPHO_SITE Tyrosine kinase phosphorylation site.

Query: 637 RAWKDHKAY 645

>PS00008|PDOC00008|MYRISTYL N-myristoylation site.

Query: 19 GSLCST 24
Query: 161 GLIKNS 166
Query: 325 GLVKGK 330
Query: 592 GGNRGR 597
Query: 763 GSFCAC 768
Query: 851 GNKDGG 856



>PS00523|PDOC00117|SULFATASE_1 Sulfatases signature 1.

Query: 85 PMCCPSRSSMLTG 97



[Figure: sample lanes labeled Breast N, Breast N, Breast T, Breast T, Lung N, Lung T, Lung T, Lung T, Colon N, Colon T, Colon T, Liver, Liver, Liver N]




Input file Ftoh25278FX,l,seq; Output File 25278. trans - 
Sequence length 2940



CXXX3QGCCGGC 

G Y L 
GGC TAG CTG 

G E Q 
GGC GAG GAG 

D Q G 
GAC CAA GGC 

R L A 

AQG CTG GCG 

S R S 



'^"TLTGFSLVSLLSF 
TTGGCG ATG CAC AOC CTC ACT GGC TTC TOT CTC GTC AGC CTC OT: AGC -ITC 

SWDWAKPSFVADGPGEA 
TCCTGGGACTGGGOCAAGCOGAGCTTCGTCGCCGACGGGCCCGGGGAGGCT 

OCC TCG GOC GCT CCG CCC CAG CCT CCC CAC ATC ATC TIC ATC CTC ACG GAC 

YHDVGYHGSDIETPTLD 
TAC CAC GAC GTG GGC TAC CAT GGT TCA GAT ATC GAG ACC CCT ACG CTG GAC 



K 



K 



N 



GCC AAG GGG GTC AAG TTG GAG AAT TAT TAC ATC CAG CCC ATC TCC ACG CCT 

Q 



^ ^ CAG CTC CTC ACT GGC A GG TAC CAG ATC CAC ACA GGA CTC CAG CAT TCC ATC 

^^PQQPNCLPLDQVTLPOKC 
ATC OGC CCA CAG CAG CCC AAC TGC CTC CCC CTC GAC CAG GIG ACA CTC CCA CAG AAG CTC 

Q E A G Y S T H M V G K W H L G F Y R K 
CAG GAG GCA GGT TAT TCC ACC CAT ATC GTC GGC AAG TCG CAC CTC Gcb TTC TAC CGG AAG 



N 



GAG TGT CTC OCC ACC CGT CGG GGC TTC GAC ACX: TTC CTC G^ O^G C^ AOT G^ A^T gL 

DYYTYDNCDGPGVCGFDLH E 
GAC TAT TAC ACC TAT GAC AAC TC5T GAT GGC CCA GGC GTC TCC GGC rrc GAC CTC CAC GAG 

GENVAWGLSGQYSTMLYAQR 
GGT GAG AAT GTC GCC TGG GGG CTC AGC GGC CAG TAC TCC ACT ATC cri TAC GCC CAG CGC 



S 



H 



H 



GOCAGCCATATCCTCGCCAGCCACAGCCCTCAGCGT CCCCTC TTCCTCTAT GTCGCCTTC 

QAVHTPLQSPREYLYRYRTM 
CAG GCA GTA CAC ACA CCC CTC CAG TCC CCT CGT GAG TAC CTC TAC CGC TAC CGC ACC ATC 



K 



M 



M 



GGC AAT GTC GCC OGG CGG AAG TAC GCG GCC ATC GTC ACC TCC ATC GAT GAG GCT GTC CGC 



N 



N 



W 



AGTGACAATGGTGGCCAGACTTTCTOGGGGGGCAGC AACTOGCCGCTCOGAGGACGCAAG 



W 



15 
45 

35 
105 

55 
165 

75 
225 

95 
285 

115 
345 

135 
405 

155 
465 

175 
525 

195 
585 

215 
645 

235 
705 

255 
765 

275 
825 



NITWALKRYGFYNNSVI'IFS 295 
AAC ATC ACC TGG GCC CTC AAG CGC TAC GCT TTC TAC AAC AAC ACT GTC ATC ATC TTC TCC 885 



315 
945 



^ ^ V ^ ^^t VHSPLLK 335 

GGC ACT TAT TGG GAA GCT GGC GTC CGG GGC CTA GGC TTT GTC CAC ACT CCC CTC CTC AAG 1005 



PlG'\0 Q- 1 



T S R A L M H I T D W Y P - - T , „ 
OOV AAG CAA ACA AGC OC» GCA CTO ATG CAC ATC ACT GAC IXXS TAC 

GLAGGTTS AADGLDGY DVWP 
OGTCTGGCAOGTOOTAOCAOCTCAGCAGCCGATGGGCTAGATGOCTACGACGTCTOGcSs iSs 



H N 



GOC ATC AGC GAG OGC CGG GCC OCA CCA OQC AOG GAG ATC eTv. c;;c A^ A^ gXc O^ llH 

Y NHAQHGSLEGGFGIWKTAV ^1^ 
TACAACCATGOCCAGCATGGCTCCCTGGAGGGCQGCTITGQCATCTOGAACACCGCCGTC 12« 



w 



caggctgccatcoqcgtgggtgag tgg aag ctg CTC a^ g^ c^ tIt g^ g^t itll 

WIPPQTLATFP GSWWNLERM .1.;^ 

TOGATCOCAOCGCAGACACrGGOCAOCTTCOCGGGTAGCTOGTCGAACCTCGAACGAATC Jel 

ASVRQAVWDFNISADPYERE /I7.; 

4^ «=c*^gtccgccaggocgtotggctcttcaacatcactgctgacccttatgaacgggL 1J25 

m ! ^^AGQRPDVVRTL LARLAEY HO'^ 

^ <^<^«^«3CCAGOGGOCTGATGTGGTCOQCACCCTCCTCGCTCGCCTCGCCGAATAT 1485 

Pl! ''^'^^^PVRYPAENPRAHPDF SIS 

I *^ GCC ATC COG GTA CGC TAC CCA GCT GAG AAC CCC OGG GCT CAT CCT GAC TTT 1545 

SI I ^ ° G A W G P » ASDEEEEEEEGR 535 

-^"^ «^ <^ «^ -"^G GGG CCC TGG GCC AGT GAT GAG GAA GAG GAG GAA GAG GAA GGG AGG 1605 

'^^^SFSRGRRKKKCKICKLRS 555 

GCTCGAAGCTTCTCCOGGGGTOGTOGCAAGAAAAAATCCAAGATTTCCAAGCTrCGATCC 1665 



p. 

SI FFRKLNTRLMSQRI* 
TTT TTC CGT AAA CTC AAC ACC AGG CTA ATC TCC CAA CGG ATC TGA 

TGGTGGGGAGGGAGAAAACTGrCCTTTAGAGGATCTTOCOCACTCOGGCTOXXXarr^^ 
GTCACATCTOCATCTACAGQGAGTraSAGQGTGTAGAGTCOCTTGGTlXSAACAGGGTAGGGAGCCW^ 
; *''°°°*ATAAACCAGACTQQGATGCCTGTGTCTCAGTCCTGOCTOCTCAaX3ACTTC^^ 
■*^''=*«^™"*0°CTCAGTTTCCTCATCTG^ 

•"SQOOa^OGGTOGAGTTOCTGGOCGGCCTTQOCACTTGACAAC^^ 
■"'C^^^QGTCAGACTQGTGTCAGGCAOCGTGGTGCAAAATTCCTCTlXn^^ 

OQCCATXAACTOCKXa«XyVCCAAGGGTQGTAGAAAGAGCTGTGAAGAGCCCOCAAAOCAGTAOC^^ 

CTCCTGTGAOCTOOGQCACAGTTCTTQCCCTCTAGGCCTTGATTTOQCCACCTCCAAGTCGGGATOOCAGOCC^ 

TOCCTCCTICATGAGaCTCTaGAAGACKXXXAAGGTTGTOGAaGAOCTlX?^^ 

AAAAAAAAAAAAAAAAAAAAAAGGOCOS 



570 
1710 
Analysis of 25278 (569 aa)

Alpha, Regions - Garnier-Robson
Beta, Regions - Garnier-Robson
Beta, Regions - Chou-Fasman
Turn, Regions - Garnier-Robson
Turn, Regions - Chou-Fasman
Coil, Regions - Garnier-Robson
Hydrophilicity Plot - Kyte-Doolittle
Alpha, Amphipathic Regions - Eisenberg
Beta, Amphipathic Regions - Eisenberg
Flexible Regions - Karplus-Schulz
Antigenic Index - Jameson-Wolf
Surface Probability Plot - Emini
Prosite Pattern Matches for 25278

Prosite version: Release 12.2 of February 1995

>PS00001|PDOC00001|ASN_GLYCOSYLATION N-glycosylation site.

Query: 276 NITW 279
Query: 288 NNSV 291
Query: 466 NISA 469
Query: 496 NRTA 499

>PS00004|PDOC00004|CAMP_PHOSPHO_SITE cAMP- and cGMP-dependent protein kinase phosphorylation site.
Query: 314 RKGT 317

>PS00005|PDOC00005|PKC_PHOSPHO_SITE Protein kinase C phosphorylation site.

Query: 102 TGR 104
Query: 160 TRR 162
Query: 244 SPR 246
Query: 340 TSR 342
Query: 383 SPR 385
Query: 457 SVR 459
Query: 566 SQR 568



>PS00006|PDOC00006|CK2_PHOSPHO_SITE Casein kinase II phosphorylation site.

Query: 67 SDIE 70
Query: 244 SPRE 247
Query: 268 TSMD 271
Query: 317 TYWE 320
Query: 363 SAAD 366
Query: 525 SDEE 528



>PS00007|PDOC00007|TYR_PHOSPHO_SITE Tyrosine kinase phosphorylation site.
Query: 134 KLQEAGY 140

>PS00008|PDOC00008|MYRISTYL N-myristoylation site.

Query: 110 GLQHSI 115
Query: 169 GSLTGN 174
Query: 205 GQYSTM 210
Query: 300 GQTFSG 305
Query: 321 GGVRGL 326
Query: 356 GLAGGT 361
Query: 402 GSLEGG 407
Query: 409 GIWNTA 414
Query: 447 GSWWNL 452



>PS00009|PDOC00009|AMIDATION Amidation site.

Query: 312 RGRK 315

Query: 541 RGRR 544

>PS00149|PDOC00117|SULFATASE_2 Sulfatases signature 2.
Query: 139 GYSTHMVGKW 148

>PS00523|PDOC00117|SULFATASE_1 Sulfatases signature 1.
Query: 91 PICTPSRSQLLTG 103
Input file 26212cons; Output File 26212pat
Sequence length 2266 

CAOGCXmx:GCCCACGOGTCCGTGGAGATAtTAACTTT^ 

GGGGAGGAGGAQGAGAAAGTGAAATGTGCTGGAGAAGAGCGAGCCCTCCT^^ 

ATCACTTCTGGAAGATTAAAGTTGTCGGACATGGTGACAGC^ 

TCACCXnXiriXSTTGGGTGCATGTGTGCGCCCGCA^ 

MAPRGCAGHPPPPSPQAC 18 

TGAGTGA ATG GOT CXX: AGG GGC TGT GCG GGG CAT CCG CCT CCX3 CCT TCT CCA CAG GCC TGT 54 

VCPGKMLAMGALAGFWILCL 38 

GTC TGT CCT GGA AAG ATG CTA GCA ATG GGG GCG CTG GCA GGA TTC TGG ATC CTC TGC CTC 114 

LTYGYLSWGQALEEEEEGAL 58 

CTC ACT TAT GGT TAC CTG TCC TGG GGC CAG GCC TTA GAA GAG GAG GAA GAA GGG GCC TTA 174 

LAQAGEKLEPSTTSTSQPHL 78

7ipTA GCT CAA GCT GGA GAG AAA CTA GAG CCC AGC ACA ACT TCC ACC TCC CAG CCC CAT CTC 234 

IFILADDQGFRDVGYHGSEI 98

ATT TTC ATC CTA GCG GAT GAT CAG GGA TTT AGA GAT GTG GGT TAC CAC GGA TCT GAG ATT 294

p^l'TPTLDKLAAEGVK LENYYV 118 

'^U A ACA CCT ACT CTT GAC AAG CTC GCT GCC GAA GGA GTT AAA CTG GAG AAC TAC TAT GTC 354 

k\ : 

I P I C T P S R S Q F I T G K Y Q I H T 138 

'tr-M G CCT ATT TGC ACA CCA TCC AGG AGT CAG TTT ATT ACT GGA AAG TAT CAG ATA CAC ACC 414 

GLQHSIIRPTQPNCLPLDNA 158

y.J A CTT CAA CAT TCT ATC ATA AGA CCT ACC CAA CCC AAC TGT TTA CCT CTG GAC AAT GCC 474 

TLPQKLKEVGYSTHMVGKWH 178

li? C CTA CCT CAG AAA CTG AAG GAG GTT GGA TAT TCA ACG CAT ATG GTC GGA AAA TGG CAC 534 

G F Y R K E C MPTRRGFDT FFG 198 

G GGT TTT TAC AGA AAA GAA TGC ATG Cl!C ACC AUA AGA (MA Ti'T GAT ACC TTT TTT GGT 594 

SLLGSGDYYTHYKCDSPGMC 218

C CTT TTG GGA AGT GGG GAT TAC TAT ACA CAC TAC AAA TGT GAC AGT CCT GGG ATG TGT 654 

YDLYENDNAAWDYDNGIYS 238 

C TAT GAC TTG TAT GAA AAC GAC AAT GCT GCC TGG GAC TAT GAC AAT GGC ATA TAC TCC 714

TQMYTQRVQQILASHNPTKP 258

ACA CAG ATG TAC ACT CAG AGA GTA CAG CAA ATC TTA GCT TCC CAT AAC CCC ACA AAG CCT 774 

IFLYIAYQAVHSPLQAPGRY 278

ATA TTT TTA TAT ATT GCC TAT CAA GCT GTT CAT TCA CCA CTG CAA GCT CCT GGC AGG TAT 834 

FEHYRSIININRRRYAAMLS 298

TTC GAA CAC TAC CGA TCC ATT ATC AAC ATA AAC AGG AGG AGA TAT GCT GCC ATG CTT TCC 894 

CLDEAINNVTLALKTYGFYN 318 

TGC TTA GAT GAA GCA ATC AAC AAC GTG ACA TTG GCT CTA AAG ACT TAT GGT TTC TAT AAC 954 

NSIIIYSSDNGGQPTAGGSN 338

AAC AGC ATT ATC ATT TAC TCT TCA GAT AAT GGT GGC CAG CCT ACG GCA GGA GGG AGT AAC 1014 

WPLRGSKGTYWEGGIRAVGF 358

TGG CCT CTC AGA GGT AGC AAA GGA ACA TAT TGG GAA GGA GGG ATC CGG GCT GTA GGC TTT 1074 




t^lOr 15 CO 



H S 



K 



GTG CAT AGC CCA CTT CTG AAA AAC AAG GGA ACA GTC TCT AAG GAA CTT G^is cSc ATC ACT 

DWYPTLlSLA EGQIDPnT^ , 
GAC TGG TAG COC ACT CTC ATT TCA CTG GCT GAA GGA CAG ATT GAT GAG GAC aJt cL cJa 

° ° ^ ° ^ " E T I S EGLRSPRvr,^ 
GAT GGC TAT GAT ATC TGG GAG ACC ATA AGT GAG GGT CTT OGC TCA CCC CGA GTA gIt aJt 

'i-X-« cJt aL aJt GAC C^ A^A tIc A Jc aJg gJa aL aJt G^ T^ tSs gJa gJa G^ tIt 



w 



N 



GGG ATC TGG AAC ACT GCA ATC CAG T^A G^C ATC AGA GTC CAG C^C tL aL^ T^ C^T aL 

-S.NPGYSDWVPPQSFSNI n r, 
GGA AAT CCT GGC TAC AGC GAC TGG GTC CCC CCT CAG TCT TTC AGC AAC cJ^ gSa C^ aJc 



R 

CGG 



CI 

iACA 

m 
rij 



WHNER ITSSTGKSVWT 
TGG CAC AAT GAA CGG ATC ACC TCXS TCA ACT GGC AAA AGT GTA TGG C^^ A^C ATC 

ADPYERVDLSNRYPGTvir 
GCC GAC CCA TAT GAG AGG GTG GAC CTA TCT AAC AGG TAT CCA GGA ATC GTG A^G A^G 

J^RRLSQFNKTAVPVRYPPir 
CTA CGG AGG CTC TCA CAG TTC AAC AAA ACT GCA GTC CCG GTC AGG TAT CCC C^C aL 

^RSNPRLNGGV* 
CCC AGA AGT AAC CCT AGG CTC AAT GGA GGG GTC TAG 

^CATGGTATAGAGAGGAAACCAAGAAAAAGAAGCCAAGCAAAAATCAGGCTCAGAAA^ 
^^GAAGAAGAAACAGCAGAAAGCAGTCTCAGGTTCAACTTGCCATTCAGGTCTTACTTCTGGATAAGCACA^^ 
PGTTTGGTTAAACTTTAATCAGTTCTTATCTTTCATCTCTTTCCTAGGTAAACCAGCAAATT^ 
f:\ ,rGGCCTAAGCGTCAGGCTTCTTTTCATCCTCTGCCACCTGGTGCCGAATTC 

m ' 



378 
1134 

398 
1194 

418 
1254 

438 
1314 

458 
1374 

478 
1434 

498 
1494 

518 
1554 

538 
1614 

551 
1653 
Alpha, Regions - Garnier-Robson
Alpha, Regions - Chou-Fasman
Beta, Regions - Garnier-Robson
Beta, Regions - Chou-Fasman
Turn, Regions - Garnier-Robson
Turn, Regions - Chou-Fasman
Coil, Regions - Garnier-Robson
Hydrophilicity Plot - Kyte-Doolittle
Alpha, Amphipathic Regions - Eisenberg
Beta, Amphipathic Regions - Eisenberg
Flexible Regions - Karplus-Schulz
Antigenic Index - Jameson-Wolf
Surface Probability Plot - Emini
Prosite Pattern Matches for 26212prot

Prosite version: Release 12.2 of February 1995

>PS00001|PDOC00001|ASN_GLYCOSYLATION N-glycosylation site.


Query: 157 NATL 160
Query: 306 NVTL 309
Query: 318 NNSI 321
Query: 431 NGSW 434
Query: 497 NITA 500
Query: 527 NKTA 530



>PS00004|PDOC00004|CAMP_PHOSPHO_SITE cAMP- and cGMP-dependent protein kinase phosphorylation site.

Query: 521 RRLS 524



>PS00005|PDOC00005|PKC_PHOSPHO_SITE Protein kinase C phosphorylation site.

Query: 131 TGK 133
Query: 189 TRR 191
Query: 243 TQR 245
Query: 413 SPR 415
Query: 489 TGK 491
Query: 509 SNR 511



>PS00006|PDOC00006|CK2_PHOSPHO_SITE Casein kinase II phosphorylation site.

Query: 298 SCLD 301
Query: 347 TYWE 350
Query: 386 SLAE 389
Query: 406 TISE 409



>PS00007|PDOC00007|TYR_PHOSPHO_SITE Tyrosine kinase phosphorylation site.
Query: 163 KLKEVGY 169 

>PS00008|PDOC00008|MYRISTYL N-myristoylation site.

Query: 28 GALAGF 33
Query: 56 GALLAQ 61
Query: 139 GLQHSI 144
Query: 198 GSLLGS 203
Query: 235 GIYSTQ 240
Query: 329 GGQPTA 334
Query: 343 GSKGTY 348
Query: 351 GGIRAV 356
Query: 432 GSWAAG 437
Query: 439 GIWNTA 444


>PS00149|PDOC00117|SULFATASE_2 Sulfatases signature 2.
Query: 168 GYSTHMVGKW 177

>PS00523|PDOC00117|SULFATASE_1 Sulfatases signature 1.
Query: 120 PICTPSRSQFITG 132



